Learning statistically characterized resonance targets in a hidden trajectory model of speech coarticulation and reduction
Abstract
We report our new development of a hidden trajectory model for co-articulated, time-varying patterns of speech. The model uses bi-directional filtering of vocal tract resonance targets to jointly represent contextual variation and phonetic reduction in speech acoustics. A novel maximum-likelihood-based learning algorithm is presented that accurately estimates the distributional parameters of the resonance targets. The results of the estimates are analyzed and shown to be consistent with all the relevant acoustic-phonetic facts and intuitions. Phonetic recognition experiments demonstrate that the model with more rigorous target training outperforms the most recent earlier version of the model, producing 17.5% fewer errors in N-best rescoring.
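The bi-directional filtering idea can be illustrated with a minimal sketch. The abstract does not give the filter's form, so everything specific here is an assumption made for illustration: a symmetric, exponentially decaying kernel, the names `bidirectional_filter`, `gamma`, and `half_width`, and the target values in Hz. It is not the authors' implementation, and it treats targets as fixed numbers rather than the distributions the paper learns.

```python
def bidirectional_filter(targets, gamma=0.6, half_width=7):
    """Smooth a piecewise-constant target sequence with a symmetric FIR kernel.

    Assumed kernel (hypothetical, not taken from the paper):
    weight(k, tau) = gamma ** abs(k - tau) for |k - tau| <= half_width,
    normalized so the weights at each frame sum to one.
    """
    n = len(targets)
    smoothed = []
    for k in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        # Look both backward and forward in time around frame k.
        for tau in range(max(0, k - half_width), min(n, k + half_width + 1)):
            w = gamma ** abs(k - tau)
            weighted_sum += w * targets[tau]
            weight_total += w
        smoothed.append(weighted_sum / weight_total)
    return smoothed


# Hypothetical resonance targets in Hz: a short 3-frame segment
# flanked by two longer 10-frame segments.
targets = [500.0] * 10 + [1500.0] * 3 + [500.0] * 10
trajectory = bidirectional_filter(targets)
```

In this toy setting the short middle segment never reaches its 1500 Hz target, because neighboring frames on both sides pull the trajectory toward 500 Hz. That undershoot is the kind of behavior the abstract refers to as reduction, while the smooth transitions at the segment boundaries stand in for contextual variation.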
Similar resources
Speaker-adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation
A novel speaker-adaptive learning algorithm is developed and evaluated for a hidden trajectory model of speech coarticulation and reduction. Central to this model is the process of bi-directional (forward and backward) filtering of the vocal tract resonance (VTR) target sequence. The VTR targets are key parameters of the model that control the hidden VTR’s dynamic behavior and the subsequent ac...
Novel Acoustic Modeling with Structured Hidden Dynamics for Speech Coarticulation and Reduction
We report in this paper our recent progress on the new development, implementation, and evaluation of the structured speech model with statistically characterized hidden trajectories. Unidirectionality in coarticulation modeling in such hidden trajectory models as presented in previous EARS workshops has been extended to bi-directionality (forward as well as backward in the temporal dimension),...
Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM - model and training
We propose and evaluate a new acoustic model that combines HMM and a special type of the hidden dynamic model (HDM) – a target-directed hidden trajectory model – into a single integrated model named HTHMM. The new model provides a computational model of coarticulation by representing the internal dynamics of human speech based on the hidden trajectory of the vocal-tract resonances. This paper f...
Handling phonetic context and speaker variation in a structure-based speech recognizer
Recently we have developed a novel type of structure-based speech recognizer, which uses a parameterized, non-recursive “hidden” trajectory model of vocal tract resonances (VTR) or formants to capture the dynamic structure of long-range speech coarticulation and reduction. The underlying model of this recognizer carries out bi-directional FIR filtering on the piecewise constant sequences of the V...
Towards an improved model of dynamics for speech recognition and synthesis
This thesis describes the research on the use of non-linear formant trajectories to model speech dynamics under the framework of a multiple-level segmental hidden Markov model (MSHMM). The particular type of intermediate-layer model investigated in this study is based on the 12-dimensional parallel formant synthesiser (PFS) control parameters, which can be directly used to synthesise speech wit...